R Regular Expression

Common examples:

Type              Regular expression            Example
Integer           [0-9]+                        5815
Floating-point    [0-9]+\.[0-9]+                58.15
Letters only      [A-Za-z]+                     CGUIM
Email             [a-zA-Z0-9]+@[a-zA-Z0-9.]+    im@mail.cgu.edu.tw
URL               http://[a-zA-Z0-9./_]+        http://www.is.cgu.edu.tw/
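
As a quick check, the patterns in the table can be tried against the table's own examples. The sketch below is only an illustration: the `^` and `$` anchors, the names `patterns` and `examples`, and the counter-example "58x15" are assumptions added here. The decimal point is written as `\\.` because an unescaped `.` matches any character.

```r
# Patterns from the table, wrapped in ^ and $ (an addition here) so that the
# whole string has to match rather than just a part of it.
patterns <- c(
  integer = "^[0-9]+$",
  float   = "^[0-9]+\\.[0-9]+$",   # "\\." matches a literal dot
  letters = "^[A-Za-z]+$",
  email   = "^[a-zA-Z0-9]+@[a-zA-Z0-9.]+$",
  url     = "^http://[a-zA-Z0-9./_]+$"
)
examples <- c("5815", "58.15", "CGUIM", "im@mail.cgu.edu.tw",
              "http://www.is.cgu.edu.tw/")

# Test each example against the pattern in the same position.
matched <- mapply(grepl, patterns, examples)
print(matched)

# An unescaped dot accepts any character in place of the decimal point.
loose  <- grepl("^[0-9]+.[0-9]+$", "58x15")     # hypothetical counter-example
strict <- grepl(patterns[["float"]], "58x15")
print(c(loose = loose, strict = strict))
```

Here `loose` comes out TRUE and `strict` FALSE, which is why the escaped form is the safer choice for decimal numbers.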

R functions that accept regular expressions
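
Several base R functions take a regular expression as their pattern argument, among them `grep`, `grepl`, `sub`, `gsub`, `regexpr` (together with `regmatches`), and `strsplit`. The following is a minimal sketch of each; the vector `x` and the result names are made-up sample data, not part of the original material.

```r
# Hypothetical sample data for illustration
x <- c("tel 03-2118800", "csim@mail.cgu.edu.tw", "no digits here")

idx     <- grep("[0-9]+", x)                    # positions of matching elements
hit     <- grepl("[0-9]+", x)                   # TRUE/FALSE for every element
first   <- sub("[0-9]", "#", x[1])              # replace the first match only
all_rep <- gsub("[0-9]", "#", x[1])             # replace every match
found   <- regmatches(x, regexpr("[0-9]+", x))  # text of the first match, matching elements only
parts   <- strsplit("03-2118800", "-")[[1]]     # split a string on a pattern

print(list(idx = idx, hit = hit, first = first,
           all_rep = all_rep, found = found, parts = parts))
```

`grep` with `value = TRUE`, as used throughout the rest of this section, returns the matching strings themselves instead of their positions.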

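Regular expression syntax

Escape character

\ (the backslash; inside an R string literal it must itself be escaped and is written "\\")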
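
To make the effect of escaping concrete, here is a small sketch that compares an unescaped dot, an escaped dot, and a literal (non-regex) search. The file names are hypothetical sample data. `fixed = TRUE` is a standard argument of `grep` that turns off regular-expression interpretation entirely.

```r
# Hypothetical sample data for illustration
files <- c("data.csv", "datacsv", "data_csv.txt")

# Unescaped: "." matches any single character, so "data_csv" qualifies too.
any_char <- grep("data.csv", files, value = TRUE)

# Escaped: "\\." in the R string is the regex \. and matches only a literal dot.
literal <- grep("data\\.csv", files, value = TRUE)

# fixed = TRUE treats the whole pattern as plain text, so no escaping is needed.
fixed_match <- grep("data.csv", files, value = TRUE, fixed = TRUE)

print(any_char)
print(literal)
print(fixed_match)

# The R string "\\." holds two characters: a backslash and a dot.
print(nchar("\\."))
```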

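Quantifiers

stringVector <- c("a", "abc", "ac", "abbc", "abbbc", "abbbbc")
# *     : the preceding item zero or more times
grep("ab*", stringVector, value = TRUE)
## [1] "a"      "abc"    "ac"     "abbc"   "abbbc"  "abbbbc"
# +     : the preceding item one or more times
grep("ab+", stringVector, value = TRUE)
## [1] "abc"    "abbc"   "abbbc"  "abbbbc"
# ?     : the preceding item zero or one time
grep("ab?c", stringVector, value = TRUE)
## [1] "abc" "ac"
# {n}   : exactly n times
grep("ab{2}c", stringVector, value = TRUE)
## [1] "abbc"
# {n,}  : at least n times
grep("ab{2,}c", stringVector, value = TRUE)
## [1] "abbc"   "abbbc"  "abbbbc"
# {n,m} : between n and m times
grep("ab{2,3}c", stringVector, value = TRUE)
## [1] "abbc"  "abbbc"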
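
`grep` only reports which elements match. To see how much text each quantifier actually consumed, the matched portion can be extracted with `regexpr` and `regmatches`. This is a sketch added for illustration; the names `star` and `bounded` are assumptions.

```r
stringVector <- c("a", "abc", "ac", "abbc", "abbbc", "abbbbc")

# "b*" allows zero b's, so every string that contains "a" matches "ab*".
star <- regmatches(stringVector, regexpr("ab*", stringVector))
print(star)

# Quantifiers are greedy: "b{2,3}" takes three b's whenever three are available.
# Strings with fewer than two b's do not match and are dropped from the result.
bounded <- regmatches(stringVector, regexpr("ab{2,3}", stringVector))
print(bounded)
```

The first result explains why `"ab*"` selects every element of the vector: for "a" and "ac" the match is just "a".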

Position anchors

stringVector <- c("abc", "bcd", "cde", "def", "abc def", "bcdefg abc")
# ^   : start of the string
grep("^bc", stringVector, value = TRUE)
## [1] "bcd"        "bcdefg abc"
# $   : end of the string
grep("bc$", stringVector, value = TRUE)
## [1] "abc"        "bcdefg abc"
# \\b : at a word boundary
grep("\\bde", stringVector, value = TRUE)
## [1] "def"     "abc def"
# \\B : not at a word boundary
grep("\\Bde", stringVector, value = TRUE)
## [1] "cde"        "bcdefg abc"
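
A common use of anchors is validating a whole string instead of finding a match somewhere inside it. The sketch below illustrates the difference; the vectors `inputs` and `words` are hypothetical sample data.

```r
# Hypothetical sample data for illustration
inputs <- c("5815", "58a15", "id 5815")

# Without anchors, a match anywhere in the string is enough.
anywhere <- grepl("[0-9]+", inputs)

# With ^ and $, the entire string has to consist of digits.
whole <- grepl("^[0-9]+$", inputs)

print(rbind(anywhere, whole))

# \\b on both sides matches "def" only as a complete word.
words <- c("def", "abc def", "bcdefg abc", "undefined")
whole_word <- grep("\\bdef\\b", words, value = TRUE)
print(whole_word)
```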

Operators

stringVector <- c("03-2118800", "02-23123456", "0988123456",
                  "07-118", "0-888", "csim@mail.cgu.edu.tw", "csim@.", "csim@",
                  "http://www.is.cgu.edu.tw/")
# [0-9] : any one character in the range 0 to 9
grep("[0-9]{2}-[0-9]{7,8}", stringVector, value = TRUE)
## [1] "03-2118800"  "02-23123456"
grep("[0-9]{10}", stringVector, value = TRUE)
## [1] "0988123456"
# |     : alternation (either "02" or "03")
grep("02|03", stringVector, value = TRUE)
## [1] "03-2118800"  "02-23123456"
# "csim@." also matches because "." belongs to the class after "@"
grep("[a-zA-Z0-9_]+@[a-zA-Z0-9._]+", stringVector, value = TRUE)
## [1] "csim@mail.cgu.edu.tw" "csim@."
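
The operators can be combined with grouping parentheses and negated classes to tighten a pattern. The following is a sketch under stated assumptions: the stricter email pattern is an illustration that rejects "csim@.", not a complete email validator, and the result names are hypothetical.

```r
stringVector <- c("03-2118800", "02-23123456", "0988123456",
                  "07-118", "0-888", "csim@mail.cgu.edu.tw", "csim@.", "csim@",
                  "http://www.is.cgu.edu.tw/")

# Parentheses limit the scope of "|": area code 02 or 03 at the start only.
area <- grep("^(02|03)-", stringVector, value = TRUE)
print(area)

# [^...] negates a class: strings containing anything other than digits or "-".
non_numeric <- grep("[^0-9-]", stringVector, value = TRUE)
print(non_numeric)

# Illustrative stricter email check: the domain needs at least one
# ".label" part, so "csim@." and "csim@" are rejected.
strict_email <- grep("^[a-zA-Z0-9_]+@[a-zA-Z0-9_]+(\\.[a-zA-Z0-9_]+)+$",
                     stringVector, value = TRUE)
print(strict_email)
```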

Special symbols

stringVector <- c("03-2118800", "02-23123456", "0988123456",
                  "07-118", "0-888", "csim@mail.cgu.edu.tw", "http://www.is.cgu.edu.tw/")
# \\d : a digit, equivalent to [0-9]
grep("\\d{2}-\\d{7,8}", stringVector, value = TRUE)
## [1] "03-2118800"  "02-23123456"
grep("\\d{10}", stringVector, value = TRUE)
## [1] "0988123456"
# \\w : a word character (letter, digit, or underscore)
grep("\\w+@[a-zA-Z0-9._]+", stringVector, value = TRUE)
## [1] "csim@mail.cgu.edu.tw"
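
The shorthand symbols also have negated forms (`\\D`, `\\S`), and R additionally supports POSIX classes such as `[[:digit:]]`. The sketch below shows a few of them in replacement and validation tasks; the sample strings and result names are assumptions made for illustration.

```r
# \\D is the negation of \\d: remove everything that is not a digit.
digits_only <- gsub("\\D", "", "03-2118800")
print(digits_only)

# \\s matches whitespace: collapse runs of spaces and tabs into one space.
single_spaced <- gsub("\\s+", " ", "a   b \t c")
print(single_spaced)

# \\d and [0-9] select the same elements for this hypothetical ASCII input.
x <- c("0988123456", "csim", "07-118")
same <- identical(grepl("\\d", x), grepl("[0-9]", x))
print(same)

# POSIX class form, anchored so the whole string must be digits.
posix <- grepl("^[[:digit:]]+$", x)
print(posix)
```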

References